library(topGO)
library(knitr) # to use kable()
library(limma) # to use vennDiagram()
library(ggplot2)
TAG <- params$TAG
VAR <- params$VAR
ANNOTATION <- list(genes = '../2019-07-26/genes/annotation.txt',
isoforms = '../2019-07-26/transcripts/annotation.txt')
TAGDIR <- paste0('../2020-01-08/', TAG, '/')
This is the enrichment analysis for genes regulated by regime. Because I quantified the association of expression with regime in two different ways, this document has two sections. Section Variance reports the enrichment analysis performed with genes ordered by the proportion of expression variance explained by regime. Note that some genes with low variance may still be highly associated with regime, even if the fold change between levels of this factor is low. Section Differential expression uses an ordering of genes based on the significance of the differential expression between levels of regime, which does depend on fold change.
Functional annotation is in 2019-07-26. I will also upload two lists of genes, with either proportion of variance explained by regime or p-value of differential expression test.
tagVariance <- read.table(paste0(TAGDIR, VAR, '_variance.txt'))
tagPValue <- read.table(paste0(TAGDIR, VAR, '_pvalue.txt'))
annotation <- read.table(ANNOTATION[[TAG]], col.names = c('tagname', 'goterms'))
To initialize the topGOdata object, I will need the gene list as a named numeric vector, where the names are the genes identifiers and the numeric values, either the portion of variance explained by regime or the p-values of the differential expression test. The structure() function adds attributes to an object.
Variance <- structure(tagVariance[,1], names = row.names(tagVariance))
PValues <- structure(tagPValue[,1], names = row.names(tagPValue))
rm(tagVariance, tagPValue)
There are 18598 genes scored with a variance portion and a differential expression p-value. It should actually be the exact same genes. The annotation data frame has more than one GO term for every tag, separated by |. I need a named list.
head(annotation)
## tagname goterms
## 1 XLOC_000002 GO:0008417|GO:0016020|GO:0006486
## 2 XLOC_000009 GO:0043130|GO:0005515|GO:0043161
## 3 XLOC_000010 GO:0005840|GO:0015935|GO:0003735|GO:0006412
## 4 XLOC_000015 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
## 5 XLOC_000021 GO:0016567|GO:0006397|GO:0061630|GO:0008270
## 6 XLOC_000036 GO:0005272|GO:0006814|GO:0016020
allgenes2GO <- strsplit(as.character(annotation$goterms), "|", fixed = TRUE)
names(allgenes2GO) <- annotation$tagname
rm(annotation)
There are 9691 genes with GO annotations. But the differential expression analysis includes many more genes. In order to include the not-annotated genes in the enrichment analysis, to see if annotated or not annotated genes are more or less often differentially expressed, I should attribute a GO term to them. According to [http://geneontology.org/docs/faq/] nowadays we express lack of annotation by annotating to the root nodes, i.e. GO:0008150 biological_process, GO:0003674 molecular_function, and GO:0005575 cellular_component.
for (tag in unique(c(names(PValues), names(Variance)))) {
if (! tag %in% names(allgenes2GO)) {
allgenes2GO <- append(allgenes2GO,
structure(list(c("GO:0008150", "GO:0003674", "GO:0005575")), names = tag))
}
}
Creation of a topGO dataset is documented in section 4 of topGO’s the user manual: https://bioconductor.org/packages/release/bioc/vignettes/topGO/inst/doc/topGO.pdf. I need to use the new command, and fill up the slots. The annot object must be a function that compiles “a list of GO terms such that each element in the list is a character vector containing all the gene identifiers that are mapped to the respective GO term.” There are several options, that you can check using help(annFUN.gene2GO), for example. The annFUN.gene2GO requires a gene2GO argument, which is the list of gene-to-GO terms I made before. The geneSelectionFun object is a function that selects the interesting (most significant) genes. It is required to perform Fisher’s exact test. The nodeSize is used to prune the GO hierarchy from the terms which have less than n annotated genes.
I generate three datasets, to analyse each of the three ontologies.
selection <- function(allScores) {return(allScores < 0.05)}
GOdata.BP <- new('topGOdata',
description = 'Differential expression among Brachionus plicatilis populations.',
ontology = 'BP',
allGenes = PValues,
annot = annFUN.gene2GO,
gene2GO = allgenes2GO,
geneSelectionFun = selection,
nodeSize = 5)
GOdata.MF <- new('topGOdata',
description = 'Differential expression among Brachionus plicatilis populations.',
ontology = 'MF',
allGenes = PValues,
annot = annFUN.gene2GO,
gene2GO = allgenes2GO,
geneSelectionFun = selection,
nodeSize = 5)
GOdata.CC <- new('topGOdata',
description = 'Differential expression among Brachionus plicatilis populations.',
ontology = 'CC',
allGenes = PValues,
annot = annFUN.gene2GO,
gene2GO = allgenes2GO,
geneSelectionFun = selection,
nodeSize = 5)
DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
Num_Genes = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), numGenes),
Num_GO_terms = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
| ontology | Num_Genes | Num_GO_terms |
|---|---|---|
| BP | 15162 | 1154 |
| MF | 17429 | 576 |
| CC | 12940 | 291 |
There are more than one way to test for enrichment. Something that took me a while to understand is that not only there are different test statistics (Fisher’s exact test, Kolmogorov-Smirnov, and others) but also different algorithms: classic, elim, weight… The algorithms are ways to deal with the dependence structure among GO terms due to topology. Some algorithms are compatible with all statistics implemented in topGO. But the weight and the parentchild algorithms are only applicable to Fisher’s exact test. I am not interested in the classic algorithm, which treats GO nodes as independent, and therefore produces an excess of significant results. I will not use the Fisher’s exact test, because it dependes on an arbitrary threshold of significance on non-adjusted p-values.
BP.elim <- runTest(GOdata.BP, algorithm = "elim", statistic = "ks")
BP.weight01 <- runTest(GOdata.BP, algorithm = "weight01", statistic = "ks")
BP.lea <- runTest(GOdata.BP, algorithm = "lea", statistic = "ks")
MF.elim <- runTest(GOdata.MF, algorithm = "elim", statistic = "ks")
MF.weight01 <- runTest(GOdata.MF, algorithm = "weight01", statistic = "ks")
MF.lea <- runTest(GOdata.MF, algorithm = "lea", statistic = "ks")
CC.elim <- runTest(GOdata.CC, algorithm = "elim", statistic = "ks")
CC.weight01 <- runTest(GOdata.CC, algorithm = "weight01", statistic = "ks")
CC.lea <- runTest(GOdata.CC, algorithm = "lea", statistic = "ks")
ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
algorithm = rep(c("elim", "weight01", "lea"), 3),
TermsTested = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) length(score(X))),
Significant = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) sum(score(X) < 0.01)))
kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
| ontology | algorithm | TermsTested | Significant |
|---|---|---|---|
| BP | elim | 1154 | 35 |
| BP | weight01 | 1154 | 23 |
| BP | lea | 1154 | 70 |
| MF | elim | 576 | 38 |
| MF | weight01 | 576 | 30 |
| MF | lea | 576 | 77 |
| CC | elim | 291 | 18 |
| CC | weight01 | 291 | 10 |
| CC | lea | 291 | 35 |
rm(ResultsSummary)
The topGO package does not pay much attention to what terms are significant because they are overrepresented and which ones are underrepresented among the most differentially expressed genes. I think it’s worth separating them, to facilitate the biological interpretation. Note that not all terms listed in the tables below are significant. The scores for the three methods (elim, weight01 and lea) are non-corrected p-values.
orderedTerms <- names(sort(score(BP.weight01)))
significant.weight01 <- score(BP.weight01)[orderedTerms] <= 0.01
significant.lea <- score(BP.lea)[orderedTerms] <= 0.01
significant.elim <- score(BP.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.pvalue.sigTerms <- sigTerms
BP.all <- GenTable(GOdata.BP, elim=BP.elim, weight01=BP.weight01, lea=BP.lea,
orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
BP.all[BP.all$Significant > BP.all$Expected,],
caption = "Most over-represented terms of the Biological Process ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | GO:0055085 | transmembrane transport | 500 | 262 | 185.89 | 1 | 1.1e-12 | 4.0e-12 | 4.2e-16 |
| 2 | GO:0006508 | proteolysis | 442 | 220 | 164.33 | 2 | 2.3e-09 | 4.1e-08 | 5.9e-11 |
| 3 | GO:0006468 | protein phosphorylation | 276 | 133 | 102.61 | 3 | 2.0e-06 | 1.7e-06 | 2.0e-06 |
| 4 | GO:0055114 | oxidation-reduction process | 376 | 176 | 139.79 | 4 | 2.3e-05 | 1.4e-05 | 2.3e-05 |
| 5 | GO:0006289 | nucleotide-excision repair | 13 | 11 | 4.83 | 8 | 0.00047 | 0.00047 | 0.00047 |
| 6 | GO:0007186 | G protein-coupled receptor signaling pat… | 206 | 93 | 76.59 | 10 | 0.00092 | 0.00104 | 0.00092 |
| 7 | GO:0007160 | cell-matrix adhesion | 16 | 11 | 5.95 | 13 | 0.00116 | 0.00116 | 0.00116 |
| 8 | GO:0007165 | signal transduction | 544 | 233 | 202.25 | 30 | 0.00762 | 0.00136 | 5.6e-05 |
| 9 | GO:0006813 | potassium ion transport | 56 | 28 | 20.82 | 6 | 0.00039 | 0.00137 | 0.00039 |
| 10 | GO:0006355 | regulation of transcription, DNA-templat… | 355 | 158 | 131.98 | 5 | 0.00016 | 0.00154 | 0.00016 |
| 11 | GO:0006470 | protein dephosphorylation | 58 | 31 | 21.56 | 7 | 0.00045 | 0.00173 | 0.00045 |
| 12 | GO:0034220 | ion transmembrane transport | 147 | 74 | 54.65 | 11 | 0.00096 | 0.00205 | 0.00096 |
| 13 | GO:0005992 | trehalose biosynthetic process | 6 | 5 | 2.23 | 17 | 0.00330 | 0.00330 | 0.00330 |
| 14 | GO:1901642 | nucleoside transmembrane transport | 7 | 6 | 2.60 | 18 | 0.00360 | 0.00360 | 0.00360 |
| 15 | GO:0006979 | response to oxidative stress | 26 | 14 | 9.67 | 19 | 0.00372 | 0.00372 | 0.00372 |
| 16 | GO:0044271 | cellular nitrogen compound biosynthetic … | 783 | 300 | 291.11 | 891 | 0.73551 | 0.00415 | 0.84980 |
| 17 | GO:0043161 | proteasome-mediated ubiquitin-dependent … | 45 | 23 | 16.73 | 23 | 0.00533 | 0.00533 | 0.00533 |
| 18 | GO:0048870 | cell motility | 14 | 8 | 5.20 | 189 | 0.06783 | 0.00697 | 0.06783 |
| 20 | GO:0003341 | cilium movement | 10 | 8 | 3.72 | 31 | 0.00809 | 0.00809 | 0.00809 |
| 21 | GO:0060271 | cilium assembly | 46 | 28 | 17.10 | 9 | 0.00058 | 0.00831 | 0.00058 |
| 22 | GO:0051726 | regulation of cell cycle | 43 | 19 | 15.99 | 415 | 0.22696 | 0.00915 | 0.36822 |
| 23 | GO:0046942 | carboxylic acid transport | 20 | 14 | 7.44 | 24 | 0.00537 | 0.00937 | 0.00537 |
| 24 | GO:0006367 | transcription initiation from RNA polyme… | 13 | 9 | 4.83 | 36 | 0.01041 | 0.01041 | 0.01041 |
| 25 | GO:0007017 | microtubule-based process | 195 | 97 | 72.50 | 34 | 0.00889 | 0.01181 | 0.00149 |
| 26 | GO:0008272 | sulfate transport | 13 | 8 | 4.83 | 41 | 0.01226 | 0.01226 | 0.01226 |
| 27 | GO:0007156 | homophilic cell adhesion via plasma memb… | 21 | 10 | 7.81 | 43 | 0.01239 | 0.01239 | 0.01239 |
| 28 | GO:0006383 | transcription by RNA polymerase III | 8 | 7 | 2.97 | 47 | 0.01335 | 0.01335 | 0.01335 |
| 29 | GO:0043628 | ncRNA 3’-end processing | 6 | 4 | 2.23 | 48 | 0.01402 | 0.01402 | 0.01402 |
| 30 | GO:0016998 | cell wall macromolecule catabolic proces… | 9 | 6 | 3.35 | 49 | 0.01435 | 0.01435 | 0.01435 |
| 31 | GO:0006032 | chitin catabolic process | 9 | 6 | 3.35 | 50 | 0.01435 | 0.01435 | 0.01435 |
| 32 | GO:0006241 | CTP biosynthetic process | 5 | 5 | 1.86 | 55 | 0.01470 | 0.01470 | 0.01470 |
| 33 | GO:0046039 | GTP metabolic process | 5 | 5 | 1.86 | 56 | 0.01470 | 0.01470 | 0.01470 |
| 34 | GO:0046129 | purine ribonucleoside biosynthetic proce… | 5 | 5 | 1.86 | 57 | 0.01470 | 0.01470 | 0.01470 |
| 35 | GO:0006821 | chloride transport | 8 | 6 | 2.97 | 69 | 0.01633 | 0.01633 | 0.01633 |
kable(
BP.all[BP.all$Significant < BP.all$Expected,],
caption = "Most under-represented terms of the Biological Process ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 19 | GO:1901566 | organonitrogen compound biosynthetic pro… | 432 | 139 | 160.61 | 678 | 0.49516 | 0.00754 | 0.84012 |
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
lea=significant.lea,
elim=significant.elim)))
kable(data.frame(Term = Term(GOTERM[sigTerms]),
Definition = Definition(GOTERM[sigTerms]),
PValue=score(BP.weight01)[sigTerms]),
caption = paste('Biological process terms significantly associated with',
VAR, 'according to all 3 algorithms', sep=' '))
| Term | Definition | PValue | |
|---|---|---|---|
| GO:0055085 | transmembrane transport | The process in which a solute is transported across a lipid bilayer, from one side of a membrane to the other. | 0.0000000 |
| GO:0006508 | proteolysis | The hydrolysis of proteins into smaller polypeptides and/or amino acids by cleavage of their peptide bonds. | 0.0000000 |
| GO:0006468 | protein phosphorylation | The process of introducing a phosphate group on to a protein. | 0.0000017 |
| GO:0055114 | oxidation-reduction process | A metabolic process that results in the removal or addition of one or more electrons to or from a substance, with or without the concomitant removal or addition of a proton or protons. | 0.0000145 |
| GO:0006289 | nucleotide-excision repair | A DNA repair process in which a small region of the strand surrounding the damage is removed from the DNA helix as an oligonucleotide. The small gap left in the DNA helix is filled in by the sequential action of DNA polymerase and DNA ligase. Nucleotide excision repair recognizes a wide range of substrates, including damage caused by UV irradiation (pyrimidine dimers and 6-4 photoproducts) and chemicals (intrastrand cross-links and bulky adducts). | 0.0004708 |
| GO:0007186 | G protein-coupled receptor signaling pathway | A series of molecular signals that proceeds with an activated receptor promoting the exchange of GDP for GTP on the alpha-subunit of an associated heterotrimeric G-protein complex. The GTP-bound activated alpha-G-protein then dissociates from the beta- and gamma-subunits to further transmit the signal within the cell. The pathway begins with receptor-ligand interaction, or for basal GPCR signaling the pathway begins with the receptor activating its G protein in the absence of an agonist, and ends with regulation of a downstream cellular process, e.g. transcription. The pathway can start from the plasma membrane, Golgi or nuclear membrane (PMID:24568158 and PMID:16902576). | 0.0010380 |
| GO:0007160 | cell-matrix adhesion | The binding of a cell to the extracellular matrix via adhesion molecules. | 0.0011588 |
| GO:0007165 | signal transduction | The cellular process in which a signal is conveyed to trigger a change in the activity or state of a cell. Signal transduction begins with reception of a signal (e.g. a ligand binding to a receptor or receptor activation by a stimulus such as light), or for signal transduction in the absence of ligand, signal-withdrawal or the activity of a constitutively active receptor. Signal transduction ends with regulation of a downstream cellular process, e.g. regulation of transcription or regulation of a metabolic process. Signal transduction covers signaling from receptors located on the surface of the cell and signaling via molecules located within the cell. For signaling between cells, signal transduction is restricted to events at and within the receiving cell. | 0.0013602 |
| GO:0006813 | potassium ion transport | The directed movement of potassium ions (K+) into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. | 0.0013703 |
| GO:0006355 | regulation of transcription, DNA-templated | Any process that modulates the frequency, rate or extent of cellular DNA-templated transcription. | 0.0015381 |
| GO:0006470 | protein dephosphorylation | The process of removing one or more phosphoric residues from a protein. | 0.0017294 |
| GO:0034220 | ion transmembrane transport | A process in which an ion is transported across a membrane. | 0.0020476 |
| GO:0005992 | trehalose biosynthetic process | The chemical reactions and pathways resulting in the formation of trehalose, a disaccharide isomeric with sucrose and obtained from certain lichens and fungi. | 0.0033010 |
| GO:1901642 | nucleoside transmembrane transport | NA | 0.0036043 |
| GO:0006979 | response to oxidative stress | Any process that results in a change in state or activity of a cell or an organism (in terms of movement, secretion, enzyme production, gene expression, etc.) as a result of oxidative stress, a state often resulting from exposure to high levels of reactive oxygen species, e.g. superoxide anions, hydrogen peroxide (H2O2), and hydroxyl radicals. | 0.0037189 |
| GO:0043161 | proteasome-mediated ubiquitin-dependent protein catabolic process | The chemical reactions and pathways resulting in the breakdown of a protein or peptide by hydrolysis of its peptide bonds, initiated by the covalent attachment of ubiquitin, and mediated by the proteasome. | 0.0053345 |
| GO:0003341 | cilium movement | The directed, self-propelled movement of a cilium. | 0.0080902 |
| GO:0060271 | cilium assembly | The assembly of a cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface. Each cilium is bounded by an extrusion of the cytoplasmic membrane, and contains a regular longitudinal array of microtubules, anchored basally in a centriole. | 0.0083101 |
| GO:0046942 | carboxylic acid transport | The directed movement of carboxylic acids into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. Carboxylic acids are organic acids containing one or more carboxyl (COOH) groups or anions (COO-). | 0.0093728 |
I think the GO graph is useful to see the relationship among the significant terms. But too large graphs are impossible to read. I don’t know how to split the graph in meaningful subgraphs.
showSigOfNodes(GOdata.BP, score(BP.weight01),
firstSigNodes = sum(significant.elim),
wantedNodes = sigTerms)
## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 229
## Number of Edges = 439
##
## $complete.dag
## [1] "A graph with 229 nodes."
This is just a example of the most significant GO term:
showGroupDensity(GOdata.BP, names(score(BP.weight01))[which.min(score(BP.weight01))])
orderedTerms <- names(sort(score(MF.weight01)))
significant.weight01 <- score(MF.weight01)[orderedTerms] <= 0.01
significant.lea <- score(MF.lea)[orderedTerms] <= 0.01
significant.elim <- score(MF.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.pvalue.sigTerms <- sigTerms
MF.all <- GenTable(GOdata.MF, elim=MF.elim, weight01=MF.weight01, lea=MF.lea,
orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
MF.all[MF.all$Significant > MF.all$Expected,],
caption = "Most over-represented terms of the Molecular Function ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea |
|---|---|---|---|---|---|---|---|---|
| GO:0005515 | protein binding | 2101 | 934 | 796.69 | 1 | 8.0e-14 | 1.2e-15 | 5.5e-16 |
| GO:0005509 | calcium ion binding | 361 | 200 | 136.89 | 2 | 9.8e-12 | 9.8e-12 | 9.8e-12 |
| GO:0005525 | GTP binding | 198 | 96 | 75.08 | 4 | 6.1e-05 | 6.1e-05 | 6.1e-05 |
| GO:0004672 | protein kinase activity | 274 | 128 | 103.90 | 3 | 3.4e-05 | 7.9e-05 | 3.4e-05 |
| GO:0003774 | motor activity | 131 | 70 | 49.67 | 6 | 0.00028 | 0.00012 | 0.00028 |
| GO:0005524 | ATP binding | 786 | 335 | 298.05 | 5 | 0.00023 | 0.00023 | 0.00023 |
| GO:0022857 | transmembrane transporter activity | 583 | 286 | 221.07 | 48 | 0.01338 | 0.00029 | 2.3e-09 |
| GO:0004181 | metallocarboxypeptidase activity | 19 | 14 | 7.20 | 8 | 0.00033 | 0.00033 | 0.00033 |
| GO:0005507 | copper ion binding | 18 | 13 | 6.83 | 9 | 0.00043 | 0.00043 | 0.00043 |
| GO:0005200 | structural constituent of cytoskeleton | 17 | 11 | 6.45 | 10 | 0.00054 | 0.00054 | 0.00054 |
| GO:0016491 | oxidoreductase activity | 417 | 189 | 158.12 | 53 | 0.01737 | 0.00060 | 0.00347 |
| GO:0004252 | serine-type endopeptidase activity | 88 | 43 | 33.37 | 11 | 0.00087 | 0.00087 | 0.00087 |
| GO:0005230 | extracellular ligand-gated ion channel a… | 54 | 29 | 20.48 | 38 | 0.00987 | 0.00106 | 0.00987 |
| GO:0061630 | ubiquitin protein ligase activity | 27 | 17 | 10.24 | 14 | 0.00114 | 0.00114 | 0.00114 |
| GO:0008138 | protein tyrosine/serine/threonine phosph… | 23 | 16 | 8.72 | 18 | 0.00195 | 0.00195 | 0.00195 |
| GO:0003924 | GTPase activity | 138 | 68 | 52.33 | 20 | 0.00245 | 0.00245 | 0.00245 |
| GO:0004198 | calcium-dependent cysteine-type endopept… | 28 | 17 | 10.62 | 21 | 0.00250 | 0.00250 | 0.00250 |
| GO:0004601 | peroxidase activity | 34 | 19 | 12.89 | 39 | 0.01148 | 0.00296 | 0.01148 |
| GO:0042626 | ATPase activity, coupled to transmembran… | 86 | 49 | 32.61 | 13 | 0.00093 | 0.00354 | 0.00093 |
| GO:0005337 | nucleoside transmembrane transporter act… | 7 | 6 | 2.65 | 24 | 0.00386 | 0.00386 | 0.00386 |
| GO:0004222 | metalloendopeptidase activity | 73 | 41 | 27.68 | 25 | 0.00446 | 0.00446 | 0.00446 |
| GO:0016787 | hydrolase activity | 1382 | 655 | 524.05 | 22 | 0.00253 | 0.00474 | 4.2e-17 |
| GO:0016715 | oxidoreductase activity, acting on paire… | 14 | 10 | 5.31 | 26 | 0.00539 | 0.00539 | 0.00539 |
| GO:0016840 | carbon-nitrogen lyase activity | 8 | 7 | 3.03 | 28 | 0.00577 | 0.00577 | 0.00577 |
| GO:0008017 | microtubule binding | 62 | 35 | 23.51 | 29 | 0.00652 | 0.00652 | 0.00652 |
| GO:0005506 | iron ion binding | 72 | 40 | 27.30 | 16 | 0.00125 | 0.00663 | 0.00125 |
| GO:0030248 | cellulose binding | 7 | 6 | 2.65 | 30 | 0.00666 | 0.00666 | 0.00666 |
| GO:0005249 | voltage-gated potassium channel activity | 27 | 12 | 10.24 | 32 | 0.00749 | 0.00749 | 0.00749 |
| GO:0004930 | G protein-coupled receptor activity | 180 | 84 | 68.26 | 17 | 0.00179 | 0.00765 | 0.00179 |
| GO:0047631 | ADP-ribose diphosphatase activity | 6 | 5 | 2.28 | 34 | 0.00809 | 0.00809 | 0.00809 |
| GO:0030246 | carbohydrate binding | 51 | 31 | 19.34 | 40 | 0.01189 | 0.01012 | 0.00082 |
| GO:0016791 | phosphatase activity | 100 | 53 | 37.92 | 65 | 0.02349 | 0.01012 | 0.00079 |
| GO:0008092 | cytoskeletal protein binding | 144 | 76 | 54.60 | 37 | 0.00914 | 0.01042 | 0.00031 |
| GO:0008271 | secondary active sulfate transmembrane t… | 13 | 8 | 4.93 | 45 | 0.01317 | 0.01317 | 0.01317 |
| GO:0020037 | heme binding | 95 | 49 | 36.02 | 47 | 0.01331 | 0.01331 | 0.01331 |
| GO:0043565 | sequence-specific DNA binding | 146 | 61 | 55.36 | 91 | 0.04456 | 0.01433 | 0.04456 |
| GO:0004568 | chitinase activity | 9 | 6 | 3.41 | 50 | 0.01471 | 0.01471 | 0.01471 |
| GO:0004550 | nucleoside diphosphate kinase activity | 5 | 5 | 1.90 | 51 | 0.01607 | 0.01607 | 0.01607 |
kable(
MF.all[MF.all$Significant < MF.all$Expected,],
caption = "Most under-represented terms of the Molecular Function ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea |
|---|
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
lea=significant.lea,
elim=significant.elim)))
kable(data.frame(Term=Term(GOTERM[sigTerms]),
Definition=Definition(GOTERM[sigTerms]),
PValue=score(MF.weight01)[sigTerms]),
caption = paste('Molecular function terms significantly associated with', VAR,
'according to all 3 algorithms', sep=' '))
| Term | Definition | PValue | |
|---|---|---|---|
| GO:0005515 | protein binding | Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules). | 0.0000000 |
| GO:0005509 | calcium ion binding | Interacting selectively and non-covalently with calcium ions (Ca2+). | 0.0000000 |
| GO:0005525 | GTP binding | Interacting selectively and non-covalently with GTP, guanosine triphosphate. | 0.0000612 |
| GO:0004672 | protein kinase activity | Catalysis of the phosphorylation of an amino acid residue in a protein, usually according to the reaction: a protein + ATP = a phosphoprotein + ADP. | 0.0000787 |
| GO:0003774 | motor activity | Catalysis of the generation of force resulting either in movement along a microfilament or microtubule, or in torque resulting in membrane scission, coupled to the hydrolysis of a nucleoside triphosphate. | 0.0001188 |
| GO:0005524 | ATP binding | Interacting selectively and non-covalently with ATP, adenosine 5’-triphosphate, a universally important coenzyme and enzyme regulator. | 0.0002254 |
| GO:0004181 | metallocarboxypeptidase activity | Catalysis of the hydrolysis of C-terminal amino acid residues from a polypeptide chain by a mechanism in which water acts as a nucleophile, one or two metal ions hold the water molecule in place, and charged amino acid side chains are ligands for the metal ions. | 0.0003290 |
| GO:0005507 | copper ion binding | Interacting selectively and non-covalently with copper (Cu) ions. | 0.0004350 |
| GO:0005200 | structural constituent of cytoskeleton | The action of a molecule that contributes to the structural integrity of a cytoskeletal structure. | 0.0005415 |
| GO:0004252 | serine-type endopeptidase activity | Catalysis of the hydrolysis of internal, alpha-peptide bonds in a polypeptide chain by a catalytic mechanism that involves a catalytic triad consisting of a serine nucleophile that is activated by a proton relay involving an acidic residue (e.g. aspartate or glutamate) and a basic residue (usually histidine). | 0.0008730 |
| GO:0005230 | extracellular ligand-gated ion channel activity | Enables the transmembrane transfer of an ion by a channel that opens when a specific extracellular ligand has been bound by the channel complex or one of its constituent parts. | 0.0010573 |
| GO:0061630 | ubiquitin protein ligase activity | Catalysis of the transfer of ubiquitin to a substrate protein via the reaction X-ubiquitin + S -> X + S-ubiquitin, where X is either an E2 or E3 enzyme, the X-ubiquitin linkage is a thioester bond, and the S-ubiquitin linkage is an amide bond: an isopeptide bond between the C-terminal glycine of ubiquitin and the epsilon-amino group of lysine residues in the substrate or, in the linear extension of ubiquitin chains, a peptide bond the between the C-terminal glycine and N-terminal methionine of ubiquitin residues. | 0.0011377 |
| GO:0008138 | protein tyrosine/serine/threonine phosphatase activity | Catalysis of the reactions: protein serine + H2O = protein serine + phosphate; protein threonine phosphate + H2O = protein threonine + phosphate; and protein tyrosine phosphate + H2O = protein tyrosine + phosphate. | 0.0019535 |
| GO:0003924 | GTPase activity | Catalysis of the reaction: GTP + H2O = GDP + phosphate. | 0.0024524 |
| GO:0004198 | calcium-dependent cysteine-type endopeptidase activity | Catalysis of the hydrolysis of nonterminal peptide bonds in a polypeptide chain by a mechanism using a cysteine residue at the enzyme active center, and requiring the presence of calcium. | 0.0024962 |
| GO:0042626 | ATPase activity, coupled to transmembrane movement of substances | Catalysis of the reaction: ATP + H2O = ADP + phosphate, to directly drive the active transport of a substance across a membrane. | 0.0035393 |
| GO:0005337 | nucleoside transmembrane transporter activity | Enables the transfer of a nucleoside, a nucleobase linked to either beta-D-ribofuranose (ribonucleoside) or 2-deoxy-beta-D-ribofuranose, (a deoxyribonucleotide) from one side of a membrane to the other. | 0.0038616 |
| GO:0004222 | metalloendopeptidase activity | Catalysis of the hydrolysis of internal, alpha-peptide bonds in a polypeptide chain by a mechanism in which water acts as a nucleophile, one or two metal ions hold the water molecule in place, and charged amino acid side chains are ligands for the metal ions. | 0.0044621 |
| GO:0016787 | hydrolase activity | Catalysis of the hydrolysis of various bonds, e.g. C-O, C-N, C-C, phosphoric anhydride bonds, etc. Hydrolase is the systematic name for any enzyme of EC class 3. | 0.0047444 |
| GO:0016715 | oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, reduced ascorbate as one donor, and incorporation of one atom of oxygen | Catalysis of an oxidation-reduction (redox) reaction in which hydrogen or electrons are transferred from reduced ascorbate and one other donor, and one atom of oxygen is incorporated into one donor. | 0.0053940 |
| GO:0016840 | carbon-nitrogen lyase activity | Catalysis of the release of ammonia or one of its derivatives, with the formation of a double bond or ring. Enzymes with this activity may catalyze the actual elimination of the ammonia, amine or amide, e.g. CH-CH(-NH-R) = C=CH- + NH2-R. Others, however, catalyze elimination of another component, e.g. water, which is followed by spontaneous reactions that lead to breakage of the C-N bond, e.g. L-serine ammonia-lyase (EC:4.3.1.17), so that the overall reaction is C(-OH)-CH(-NH2) = CH2-CO- + NH3, i.e. an elimination with rearrangement. The sub-subclasses of EC:4.3 are the ammonia-lyases (EC:4.3.1), lyases acting on amides, amidines, etc. (EC:4.3.2), the amine-lyases (EC:4.3.3), and other carbon-nitrogen lyases (EC:4.3.99). | 0.0057664 |
| GO:0008017 | microtubule binding | Interacting selectively and non-covalently with microtubules, filaments composed of tubulin monomers. | 0.0065179 |
| GO:0005506 | iron ion binding | Interacting selectively and non-covalently with iron (Fe) ions. | 0.0066340 |
| GO:0030248 | cellulose binding | Interacting selectively and non-covalently with cellulose. | 0.0066580 |
| GO:0005249 | voltage-gated potassium channel activity | Enables the transmembrane transfer of a potassium ion by a voltage-gated channel. A voltage-gated channel is a channel whose open state is dependent on the voltage across the membrane in which it is embedded. | 0.0074909 |
| GO:0004930 | G protein-coupled receptor activity | Combining with an extracellular signal and transmitting the signal across the membrane by activating an associated G-protein; promotes the exchange of GDP for GTP on the alpha subunit of a heterotrimeric G-protein complex. | 0.0076490 |
| GO:0047631 | ADP-ribose diphosphatase activity | Catalysis of the reaction: ADP-ribose + H2O = AMP + D-ribose 5-phosphate. | 0.0080946 |
showSigOfNodes(GOdata.MF, score(MF.weight01),
firstSigNodes = sum(significant.elim),
wantedNodes = sigTerms)
## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 145
## Number of Edges = 195
##
## $complete.dag
## [1] "A graph with 145 nodes."
showGroupDensity(GOdata.MF, names(score(MF.weight01))[which.min(score(MF.weight01))])
orderedTerms <- names(sort(score(CC.weight01)))
significant.weight01 <- score(CC.weight01)[orderedTerms] <= 0.01
significant.lea <- score(CC.lea)[orderedTerms] <= 0.01
significant.elim <- score(CC.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.pvalue.sigTerms <- sigTerms
CC.all <- GenTable(GOdata.CC, elim=CC.elim, weight01=CC.weight01, lea=CC.lea,
orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
CC.all[CC.all$Significant > CC.all$Expected,],
caption = "Most over-represented terms of the Cellular Component ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea |
|---|---|---|---|---|---|---|---|---|
| GO:0016021 | integral component of membrane | 885 | 427 | 316.73 | 1 | 1.2e-16 | 3.5e-15 | 1.3e-22 |
| GO:0016020 | membrane | 1565 | 714 | 560.09 | 3 | 9.0e-06 | 5.7e-06 | 1.9e-26 |
| GO:0005887 | integral component of plasma membrane | 118 | 61 | 42.23 | 2 | 8.0e-06 | 2.6e-05 | 6.8e-08 |
| GO:0016459 | myosin complex | 33 | 23 | 11.81 | 4 | 4.4e-05 | 4.4e-05 | 4.4e-05 |
| GO:0098800 | inner mitochondrial membrane protein com… | 21 | 14 | 7.52 | 11 | 0.00195 | 0.00017 | 0.00195 |
| GO:0008076 | voltage-gated potassium channel complex | 13 | 8 | 4.65 | 6 | 0.00077 | 0.00077 | 0.00077 |
| GO:0090575 | RNA polymerase II transcription factor c… | 23 | 17 | 8.23 | 8 | 0.00085 | 0.00100 | 0.00085 |
| GO:0005576 | extracellular region | 140 | 65 | 50.10 | 10 | 0.00191 | 0.00263 | 0.00191 |
| GO:0005886 | plasma membrane | 170 | 88 | 60.84 | 5 | 0.00064 | 0.00413 | 5.0e-11 |
| GO:0005634 | nucleus | 531 | 216 | 190.04 | 14 | 0.00711 | 0.00938 | 0.03353 |
| GO:0042555 | MCM complex | 10 | 6 | 3.58 | 19 | 0.01007 | 0.01007 | 0.01007 |
| GO:0005743 | mitochondrial inner membrane | 33 | 22 | 11.81 | 20 | 0.01131 | 0.01131 | 0.00020 |
| GO:0098796 | membrane protein complex | 145 | 71 | 51.89 | 37 | 0.02782 | 0.01285 | 0.00525 |
| GO:0034704 | calcium channel complex | 5 | 5 | 1.79 | 24 | 0.01552 | 0.01552 | 0.01552 |
| GO:0044430 | cytoskeletal part | 152 | 73 | 54.40 | 47 | 0.03285 | 0.01625 | 0.01538 |
| GO:0005929 | cilium | 27 | 18 | 9.66 | 21 | 0.01151 | 0.01627 | 0.00174 |
| GO:0016591 | RNA polymerase II, holoenzyme | 20 | 14 | 7.16 | 9 | 0.00172 | 0.01827 | 0.00172 |
| GO:0005681 | spliceosomal complex | 12 | 6 | 4.29 | 27 | 0.01870 | 0.01870 | 0.01870 |
kable(
CC.all[CC.all$Significant < CC.all$Expected,],
caption = "Most under-represented terms of the Cellular Component ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea |
|---|
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
lea=significant.lea,
elim=significant.elim)))
kable(data.frame(Term=Term(GOTERM[sigTerms]),
Definition=Definition(GOTERM[sigTerms]),
PValue=score(CC.weight01)[sigTerms]),
caption = paste('Cellular component terms significantly associated with', VAR,
'according to all 3 algorithms', sep=' '))
| Term | Definition | PValue | |
|---|---|---|---|
| GO:0016021 | integral component of membrane | The component of a membrane consisting of the gene products and protein complexes having at least some part of their peptide sequence embedded in the hydrophobic region of the membrane. | 0.0000000 |
| GO:0016020 | membrane | A lipid bilayer along with all the proteins and protein complexes embedded in it an attached to it. | 0.0000057 |
| GO:0005887 | integral component of plasma membrane | The component of the plasma membrane consisting of the gene products and protein complexes having at least some part of their peptide sequence embedded in the hydrophobic region of the membrane. | 0.0000258 |
| GO:0016459 | myosin complex | A protein complex, formed of one or more myosin heavy chains plus associated light chains and other proteins, that functions as a molecular motor; uses the energy of ATP hydrolysis to move actin filaments or to move vesicles or other cargo on fixed actin filaments; has magnesium-ATPase activity and binds actin. Myosin classes are distinguished based on sequence features of the motor, or head, domain, but also have distinct tail regions that are believed to bind specific cargoes. | 0.0000444 |
| GO:0098800 | inner mitochondrial membrane protein complex | Any protein complex that is part of the inner mitochondrial membrane. | 0.0001717 |
| GO:0008076 | voltage-gated potassium channel complex | A protein complex that forms a transmembrane channel through which potassium ions may cross a cell membrane in response to changes in membrane potential. | 0.0007746 |
| GO:0090575 | RNA polymerase II transcription factor complex | A transcription factor complex that acts at a regulatory region of a gene transcribed by RNA polymerase II. | 0.0010016 |
| GO:0005576 | extracellular region | The space external to the outermost structure of a cell. For cells without external protective or external encapsulating structures this refers to space outside of the plasma membrane. This term covers the host cell environment outside an intracellular parasite. | 0.0026307 |
| GO:0005886 | plasma membrane | The membrane surrounding a cell that separates the cell from its external environment. It consists of a phospholipid bilayer and associated proteins. | 0.0041281 |
showSigOfNodes(GOdata.CC, score(CC.weight01),
firstSigNodes = sum(significant.elim),
wantedNodes = sigTerms)
## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 75
## Number of Edges = 132
##
## $complete.dag
## [1] "A graph with 75 nodes."
showGroupDensity(GOdata.CC, names(score(CC.weight01))[which.min(score(CC.weight01))])
I need to generate the topGO objects again, using the alternative gene ordering, based on the proportion of expression-level variance explained by regime. I miss a way to inform the topGOdata object that the score in this case is reversed, with respect to p-values: the higher it is, the more differentially expressed the gene is. To make sure that GO terms are tested in the same way than when using p-values, I will just reverse the proportion of variance explained by regime to its complement. Taking this into account, the subset of interesting genes (selection() function) must be defined as the lowest 10% scores, which are the 10% genes with largest portion of variance explained by regime.
selection <- function(allScores) {return(allScores <= quantile(allScores, probs = 0.10))}
GOdataVar.BP <- new('topGOdata',
description = 'Differential expression among Brachionus plicatilis populations.',
ontology = 'BP',
allGenes = 1.0 - Variance,
annot = annFUN.gene2GO,
gene2GO = allgenes2GO,
geneSelectionFun = selection,
nodeSize = 5)
GOdataVar.MF <- new('topGOdata',
description = 'Differential expression among Brachionus plicatilis populations.',
ontology = 'MF',
allGenes = 1.0 - Variance,
annot = annFUN.gene2GO,
gene2GO = allgenes2GO,
geneSelectionFun = selection,
nodeSize = 5)
GOdataVar.CC <- new('topGOdata',
description = 'Differential expression among Brachionus plicatilis populations.',
ontology = 'CC',
allGenes = 1.0 - Variance,
annot = annFUN.gene2GO,
gene2GO = allgenes2GO,
geneSelectionFun = selection,
nodeSize = 5)
library(knitr)
DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
Num_Genes = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), numGenes),
Num_GO_terms = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
| ontology | Num_Genes | Num_GO_terms |
|---|---|---|
| BP | 15162 | 1154 |
| MF | 17429 | 576 |
| CC | 12940 | 291 |
BPvar.elim <- runTest(GOdataVar.BP, algorithm = "elim", statistic = "ks")
BPvar.weight01 <- runTest(GOdataVar.BP, algorithm = "weight01", statistic = "ks")
BPvar.lea <- runTest(GOdataVar.BP, algorithm = "lea", statistic = "ks")
MFvar.elim <- runTest(GOdataVar.MF, algorithm = "elim", statistic = "ks")
MFvar.weight01 <- runTest(GOdataVar.MF, algorithm = "weight01", statistic = "ks")
MFvar.lea <- runTest(GOdataVar.MF, algorithm = "lea", statistic = "ks")
CCvar.elim <- runTest(GOdataVar.CC, algorithm = "elim", statistic = "ks")
CCvar.weight01 <- runTest(GOdataVar.CC, algorithm = "weight01", statistic = "ks")
CCvar.lea <- runTest(GOdataVar.CC, algorithm = "lea", statistic = "ks")
ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
algorithm = rep(c("elim", "weight01", "lea"), 3),
TermsTested = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) length(score(X))),
Significant = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) sum(score(X) < 0.01)))
kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
| ontology | algorithm | TermsTested | Significant |
|---|---|---|---|
| BP | elim | 1154 | 55 |
| BP | weight01 | 1154 | 34 |
| BP | lea | 1154 | 102 |
| MF | elim | 576 | 53 |
| MF | weight01 | 576 | 39 |
| MF | lea | 576 | 100 |
| CC | elim | 291 | 29 |
| CC | weight01 | 291 | 16 |
| CC | lea | 291 | 57 |
rm(ResultsSummary)
orderedTerms <- names(sort(score(BPvar.weight01)))
significant.weight01 <- score(BPvar.weight01)[orderedTerms] <= 0.01
significant.lea <- score(BPvar.lea)[orderedTerms] <= 0.01
significant.elim <- score(BPvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.variance.sigTerms <- sigTerms
BPvar.all <- GenTable(GOdataVar.BP, elim=BPvar.elim, weight01=BPvar.weight01, lea=BPvar.lea,
orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
BPvar.all[BPvar.all$Significant > BPvar.all$Expected,],
caption = "Most over-represented terms of the Biological Process ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | GO:0055085 | transmembrane transport | 500 | 90 | 47.85 | 1 | 2.1e-12 | 3.1e-12 | 1.8e-16 |
| 2 | GO:0006508 | proteolysis | 442 | 74 | 42.30 | 2 | 4.9e-09 | 2.7e-09 | 2.0e-12 |
| 3 | GO:0007186 | G protein-coupled receptor signaling pat… | 206 | 27 | 19.71 | 4 | 1.2e-06 | 1.1e-06 | 1.2e-06 |
| 4 | GO:0006355 | regulation of transcription, DNA-templat… | 355 | 43 | 33.97 | 3 | 3.7e-07 | 6.9e-06 | 3.7e-07 |
| 5 | GO:0034220 | ion transmembrane transport | 147 | 21 | 14.07 | 7 | 6.9e-05 | 1.1e-05 | 6.9e-05 |
| 6 | GO:0006468 | protein phosphorylation | 276 | 39 | 26.41 | 6 | 1.1e-05 | 3.8e-05 | 1.1e-05 |
| 7 | GO:0006813 | potassium ion transport | 56 | 9 | 5.36 | 5 | 6.7e-06 | 0.00019 | 6.7e-06 |
| 8 | GO:0006511 | ubiquitin-dependent protein catabolic pr… | 102 | 12 | 9.76 | 8 | 0.00019 | 0.00019 | 2.3e-05 |
| 9 | GO:0006289 | nucleotide-excision repair | 13 | 5 | 1.24 | 9 | 0.00021 | 0.00021 | 0.00021 |
| 10 | GO:0055114 | oxidation-reduction process | 376 | 47 | 35.98 | 15 | 0.00094 | 0.00028 | 0.00094 |
| 11 | GO:0060271 | cilium assembly | 46 | 7 | 4.40 | 11 | 0.00054 | 0.00054 | 1.0e-05 |
| 12 | GO:0003341 | cilium movement | 10 | 2 | 0.96 | 13 | 0.00070 | 0.00070 | 0.00070 |
| 13 | GO:0005992 | trehalose biosynthetic process | 6 | 5 | 0.57 | 14 | 0.00074 | 0.00074 | 0.00074 |
| 15 | GO:0007165 | signal transduction | 544 | 61 | 52.06 | 23 | 0.00350 | 0.00102 | 5.4e-08 |
| 17 | GO:0006030 | chitin metabolic process | 74 | 13 | 7.08 | 10 | 0.00033 | 0.00165 | 0.00033 |
| 18 | GO:0048870 | cell motility | 14 | 5 | 1.34 | 67 | 0.01357 | 0.00181 | 0.01357 |
| 19 | GO:0006367 | transcription initiation from RNA polyme… | 13 | 5 | 1.24 | 20 | 0.00273 | 0.00273 | 0.00273 |
| 20 | GO:0051260 | protein homooligomerization | 25 | 4 | 2.39 | 22 | 0.00301 | 0.00301 | 0.00301 |
| 21 | GO:0007018 | microtubule-based movement | 115 | 13 | 11.01 | 16 | 0.00113 | 0.00358 | 2.2e-05 |
| 22 | GO:0007160 | cell-matrix adhesion | 16 | 7 | 1.53 | 24 | 0.00366 | 0.00366 | 0.00366 |
| 25 | GO:0016579 | protein deubiquitination | 40 | 6 | 3.83 | 31 | 0.00400 | 0.00400 | 0.00400 |
| 26 | GO:0046039 | GTP metabolic process | 5 | 1 | 0.48 | 33 | 0.00417 | 0.00417 | 0.00417 |
| 27 | GO:0046129 | purine ribonucleoside biosynthetic proce… | 5 | 1 | 0.48 | 34 | 0.00417 | 0.00417 | 0.00417 |
| 28 | GO:1901642 | nucleoside transmembrane transport | 7 | 2 | 0.67 | 35 | 0.00418 | 0.00418 | 0.00418 |
| 30 | GO:0006744 | ubiquinone biosynthetic process | 5 | 1 | 0.48 | 41 | 0.00564 | 0.00564 | 0.00564 |
| 31 | GO:0071805 | potassium ion transmembrane transport | 19 | 6 | 1.82 | 58 | 0.01074 | 0.00590 | 0.01074 |
| 32 | GO:0006520 | cellular amino acid metabolic process | 117 | 15 | 11.20 | 862 | 0.68061 | 0.00819 | 0.95060 |
| 33 | GO:0006241 | CTP biosynthetic process | 5 | 1 | 0.48 | 48 | 0.00881 | 0.00881 | 0.00881 |
| 34 | GO:0016539 | intein-mediated protein splicing | 5 | 1 | 0.48 | 55 | 0.00964 | 0.00964 | 0.00964 |
| 35 | GO:0005975 | carbohydrate metabolic process | 202 | 35 | 19.33 | 80 | 0.01684 | 0.01164 | 0.04834 |
| 36 | GO:0006470 | protein dephosphorylation | 58 | 7 | 5.55 | 17 | 0.00131 | 0.01234 | 0.00131 |
| 37 | GO:0006165 | nucleoside diphosphate phosphorylation | 16 | 3 | 1.53 | 389 | 0.21287 | 0.01247 | 0.21287 |
| 38 | GO:0009206 | purine ribonucleoside triphosphate biosy… | 27 | 4 | 2.58 | 499 | 0.29722 | 0.01248 | 0.29722 |
| 39 | GO:0000245 | spliceosomal complex assembly | 7 | 1 | 0.67 | 69 | 0.01366 | 0.01366 | 0.01366 |
| 40 | GO:0008285 | negative regulation of cell proliferatio… | 6 | 2 | 0.57 | 70 | 0.01381 | 0.01381 | 0.01381 |
| 43 | GO:0042073 | intraciliary transport | 11 | 2 | 1.05 | 294 | 0.14469 | 0.01624 | 0.14469 |
| 44 | GO:0042398 | cellular modified amino acid biosyntheti… | 17 | 3 | 1.63 | 638 | 0.42953 | 0.01723 | 0.42953 |
| 45 | GO:0007017 | microtubule-based process | 195 | 26 | 18.66 | 195 | 0.07365 | 0.01768 | 0.03148 |
| 46 | GO:0044237 | cellular metabolic process | 2395 | 244 | 229.20 | 623 | 0.40930 | 0.01828 | 1.3e-17 |
| 47 | GO:0006689 | ganglioside catabolic process | 5 | 2 | 0.48 | 87 | 0.01839 | 0.01839 | 0.01839 |
| 48 | GO:0006383 | transcription by RNA polymerase III | 8 | 1 | 0.77 | 95 | 0.01855 | 0.01855 | 0.01855 |
| 49 | GO:0009152 | purine ribonucleotide biosynthetic proce… | 44 | 5 | 4.21 | 537 | 0.32672 | 0.01929 | 0.32672 |
| 50 | GO:0006914 | autophagy | 21 | 3 | 2.01 | 30 | 0.00395 | 0.01940 | 0.00395 |
| 51 | GO:0006626 | protein targeting to mitochondrion | 9 | 1 | 0.86 | 83 | 0.01770 | 0.02082 | 0.01770 |
| 52 | GO:0006821 | chloride transport | 8 | 1 | 0.77 | 101 | 0.02154 | 0.02154 | 0.02154 |
| 53 | GO:0030163 | protein catabolic process | 123 | 12 | 11.77 | 21 | 0.00298 | 0.02225 | 1.8e-07 |
| 55 | GO:0006979 | response to oxidative stress | 26 | 8 | 2.49 | 111 | 0.02276 | 0.02276 | 0.02276 |
kable(
BPvar.all[BPvar.all$Significant < BPvar.all$Expected,],
caption = "Most under-represented terms of the Biological Process ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 14 | GO:0006412 | translation | 195 | 3 | 18.66 | 383 | 0.20869 | 0.00087 | 0.20869 |
| 16 | GO:0000398 | mRNA splicing, via spliceosome | 46 | 1 | 4.40 | 12 | 0.00070 | 0.00129 | 0.00070 |
| 23 | GO:0007156 | homophilic cell adhesion via plasma memb… | 21 | 1 | 2.01 | 26 | 0.00375 | 0.00375 | 0.00375 |
| 24 | GO:0043161 | proteasome-mediated ubiquitin-dependent … | 45 | 4 | 4.31 | 29 | 0.00388 | 0.00388 | 0.00388 |
| 29 | GO:0044271 | cellular nitrogen compound biosynthetic … | 783 | 73 | 74.93 | 215 | 0.08360 | 0.00453 | 0.22562 |
| 41 | GO:0035556 | intracellular signal transduction | 173 | 16 | 16.56 | 125 | 0.03154 | 0.01456 | 0.25702 |
| 42 | GO:0051726 | regulation of cell cycle | 43 | 4 | 4.12 | 270 | 0.12315 | 0.01487 | 0.12315 |
| 54 | GO:0070836 | caveola assembly | 7 | 0 | 0.67 | 103 | 0.02251 | 0.02251 | 0.02251 |
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
lea=significant.lea,
elim=significant.elim)))
kable(data.frame(Term=Term(GOTERM[sigTerms]),
Definition=Definition(GOTERM[sigTerms]),
PValue=score(BPvar.weight01)[sigTerms]),
caption = paste('Biological process terms significantly associated with', VAR,
'according to all 3 algorithms', sep=' '))
| Term | Definition | PValue | |
|---|---|---|---|
| GO:0055085 | transmembrane transport | The process in which a solute is transported across a lipid bilayer, from one side of a membrane to the other. | 0.0000000 |
| GO:0006508 | proteolysis | The hydrolysis of proteins into smaller polypeptides and/or amino acids by cleavage of their peptide bonds. | 0.0000000 |
| GO:0007186 | G protein-coupled receptor signaling pathway | A series of molecular signals that proceeds with an activated receptor promoting the exchange of GDP for GTP on the alpha-subunit of an associated heterotrimeric G-protein complex. The GTP-bound activated alpha-G-protein then dissociates from the beta- and gamma-subunits to further transmit the signal within the cell. The pathway begins with receptor-ligand interaction, or for basal GPCR signaling the pathway begins with the receptor activating its G protein in the absence of an agonist, and ends with regulation of a downstream cellular process, e.g. transcription. The pathway can start from the plasma membrane, Golgi or nuclear membrane (PMID:24568158 and PMID:16902576). | 0.0000011 |
| GO:0006355 | regulation of transcription, DNA-templated | Any process that modulates the frequency, rate or extent of cellular DNA-templated transcription. | 0.0000069 |
| GO:0034220 | ion transmembrane transport | A process in which an ion is transported across a membrane. | 0.0000114 |
| GO:0006468 | protein phosphorylation | The process of introducing a phosphate group on to a protein. | 0.0000380 |
| GO:0006813 | potassium ion transport | The directed movement of potassium ions (K+) into, out of or within a cell, or between cells, by means of some agent such as a transporter or pore. | 0.0001859 |
| GO:0006511 | ubiquitin-dependent protein catabolic process | The chemical reactions and pathways resulting in the breakdown of a protein or peptide by hydrolysis of its peptide bonds, initiated by the covalent attachment of a ubiquitin group, or multiple ubiquitin groups, to the protein. | 0.0001901 |
| GO:0006289 | nucleotide-excision repair | A DNA repair process in which a small region of the strand surrounding the damage is removed from the DNA helix as an oligonucleotide. The small gap left in the DNA helix is filled in by the sequential action of DNA polymerase and DNA ligase. Nucleotide excision repair recognizes a wide range of substrates, including damage caused by UV irradiation (pyrimidine dimers and 6-4 photoproducts) and chemicals (intrastrand cross-links and bulky adducts). | 0.0002075 |
| GO:0055114 | oxidation-reduction process | A metabolic process that results in the removal or addition of one or more electrons to or from a substance, with or without the concomitant removal or addition of a proton or protons. | 0.0002837 |
| GO:0060271 | cilium assembly | The assembly of a cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface. Each cilium is bounded by an extrusion of the cytoplasmic membrane, and contains a regular longitudinal array of microtubules, anchored basally in a centriole. | 0.0005423 |
| GO:0003341 | cilium movement | The directed, self-propelled movement of a cilium. | 0.0007025 |
| GO:0005992 | trehalose biosynthetic process | The chemical reactions and pathways resulting in the formation of trehalose, a disaccharide isomeric with sucrose and obtained from certain lichens and fungi. | 0.0007430 |
| GO:0007165 | signal transduction | The cellular process in which a signal is conveyed to trigger a change in the activity or state of a cell. Signal transduction begins with reception of a signal (e.g. a ligand binding to a receptor or receptor activation by a stimulus such as light), or for signal transduction in the absence of ligand, signal-withdrawal or the activity of a constitutively active receptor. Signal transduction ends with regulation of a downstream cellular process, e.g. regulation of transcription or regulation of a metabolic process. Signal transduction covers signaling from receptors located on the surface of the cell and signaling via molecules located within the cell. For signaling between cells, signal transduction is restricted to events at and within the receiving cell. | 0.0010235 |
| GO:0000398 | mRNA splicing, via spliceosome | The joining together of exons from one or more primary transcripts of messenger RNA (mRNA) and the excision of intron sequences, via a spliceosomal mechanism, so that mRNA consisting only of the joined exons is produced. | 0.0012854 |
| GO:0006030 | chitin metabolic process | The chemical reactions and pathways involving chitin, a linear polysaccharide consisting of beta-(1->4)-linked N-acetyl-D-glucosamine residues. | 0.0016545 |
| GO:0006367 | transcription initiation from RNA polymerase II promoter | Any process involved in the assembly of the RNA polymerase II preinitiation complex (PIC) at an RNA polymerase II promoter region of a DNA template, resulting in the subsequent synthesis of RNA from that promoter. The initiation phase includes PIC assembly and the formation of the first few bonds in the RNA chain, including abortive initiation, which occurs when the first few nucleotides are repeatedly synthesized and then released. Promoter clearance, or release, is the transition between the initiation and elongation phases of transcription. | 0.0027295 |
| GO:0051260 | protein homooligomerization | The process of creating protein oligomers, compounds composed of a small number, usually between three and ten, of identical component monomers. Oligomers may be formed by the polymerization of a number of monomers or the depolymerization of a large protein polymer. | 0.0030076 |
| GO:0007018 | microtubule-based movement | A microtubule-based process that results in the movement of organelles, other microtubules, or other cellular components. Examples include motor-driven movement along microtubules and movement driven by polymerization or depolymerization of microtubules. | 0.0035823 |
| GO:0007160 | cell-matrix adhesion | The binding of a cell to the extracellular matrix via adhesion molecules. | 0.0036621 |
| GO:0007156 | homophilic cell adhesion via plasma membrane adhesion molecules | The attachment of a plasma membrane adhesion molecule in one cell to an identical molecule in an adjacent cell. | 0.0037547 |
| GO:0043161 | proteasome-mediated ubiquitin-dependent protein catabolic process | The chemical reactions and pathways resulting in the breakdown of a protein or peptide by hydrolysis of its peptide bonds, initiated by the covalent attachment of ubiquitin, and mediated by the proteasome. | 0.0038846 |
| GO:0016579 | protein deubiquitination | The removal of one or more ubiquitin groups from a protein. | 0.0040033 |
| GO:0046039 | GTP metabolic process | The chemical reactions and pathways involving GTP, guanosine triphosphate. | 0.0041658 |
| GO:0046129 | purine ribonucleoside biosynthetic process | The chemical reactions and pathways resulting in the formation of any purine ribonucleoside, a nucleoside in which purine base is linked to a ribose (beta-D-ribofuranose) molecule. | 0.0041658 |
| GO:1901642 | nucleoside transmembrane transport | NA | 0.0041829 |
| GO:0006744 | ubiquinone biosynthetic process | The chemical reactions and pathways resulting in the formation of ubiquinone, a lipid-soluble electron-transporting coenzyme. | 0.0056367 |
| GO:0006241 | CTP biosynthetic process | The chemical reactions and pathways resulting in the formation of CTP, cytidine 5’-triphosphate. | 0.0088100 |
| GO:0016539 | intein-mediated protein splicing | The removal of an internal amino acid sequence (an intein) from a protein during protein maturation; the excision of inteins is precise and the N- and C-terminal exteins are joined by a normal peptide bond. Protein splicing involves 4 nucleophilic displacements by the 3 conserved splice junction residues. | 0.0096428 |
showSigOfNodes(GOdataVar.BP, score(BPvar.weight01),
firstSigNodes = sum(significant.elim),
wantedNodes = sigTerms)
## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 327
## Number of Edges = 619
##
## $complete.dag
## [1] "A graph with 327 nodes."
Below I plot variance portion, but for the termo found most significant when using p-values, for comparison.
showGroupDensity(GOdataVar.BP, names(score(BP.weight01))[which.min(score(BP.weight01))])
orderedTerms <- names(sort(score(MFvar.weight01)))
significant.weight01 <- score(MFvar.weight01)[orderedTerms] <= 0.01
significant.lea <- score(MFvar.lea)[orderedTerms] <= 0.01
significant.elim <- score(MFvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.variance.sigTerms <- sigTerms
MFvar.all <- GenTable(GOdataVar.MF, elim=MFvar.elim, weight01=MFvar.weight01, lea=MFvar.lea,
orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
MFvar.all[MFvar.all$Significant > MFvar.all$Expected,],
caption = "Most over-represented terms of the Molecular Function ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | GO:0005515 | protein binding | 2101 | 244 | 209.27 | 1 | 1.8e-21 | 3.9e-24 | 2.0e-22 |
| 2 | GO:0005509 | calcium ion binding | 361 | 69 | 35.96 | 2 | 2.9e-18 | 2.9e-18 | 2.9e-18 |
| 3 | GO:0003774 | motor activity | 131 | 21 | 13.05 | 3 | 3.5e-07 | 3.5e-07 | 5.1e-07 |
| 4 | GO:0004930 | G protein-coupled receptor activity | 180 | 21 | 17.93 | 4 | 7.2e-06 | 2.1e-05 | 2.0e-06 |
| 5 | GO:0005525 | GTP binding | 198 | 26 | 19.72 | 5 | 2.8e-05 | 2.8e-05 | 2.8e-05 |
| 6 | GO:0004672 | protein kinase activity | 274 | 37 | 27.29 | 7 | 0.00012 | 4.0e-05 | 0.00012 |
| 7 | GO:0005230 | extracellular ligand-gated ion channel a… | 54 | 10 | 5.38 | 8 | 0.00018 | 4.5e-05 | 0.00018 |
| 8 | GO:0022857 | transmembrane transporter activity | 583 | 94 | 58.07 | 29 | 0.00413 | 0.00015 | 2.8e-14 |
| 9 | GO:0005524 | ATP binding | 786 | 80 | 78.29 | 9 | 0.00020 | 0.00020 | 0.00020 |
| 10 | GO:0004888 | transmembrane signaling receptor activit… | 251 | 33 | 25.00 | 25 | 0.00236 | 0.00021 | 1.3e-07 |
| 11 | GO:0004181 | metallocarboxypeptidase activity | 19 | 7 | 1.89 | 11 | 0.00032 | 0.00032 | 0.00032 |
| 12 | GO:0005507 | copper ion binding | 18 | 5 | 1.79 | 13 | 0.00068 | 0.00068 | 0.00068 |
| 13 | GO:0004252 | serine-type endopeptidase activity | 88 | 22 | 8.77 | 16 | 0.00141 | 0.00141 | 0.00141 |
| 14 | GO:0005249 | voltage-gated potassium channel activity | 27 | 4 | 2.69 | 17 | 0.00142 | 0.00142 | 0.00142 |
| 15 | GO:0042626 | ATPase activity, coupled to transmembran… | 86 | 14 | 8.57 | 32 | 0.00474 | 0.00183 | 0.00474 |
| 16 | GO:0061630 | ubiquitin protein ligase activity | 27 | 3 | 2.69 | 21 | 0.00211 | 0.00211 | 0.00211 |
| 18 | GO:0005200 | structural constituent of cytoskeleton | 17 | 8 | 1.69 | 23 | 0.00216 | 0.00216 | 0.00216 |
| 19 | GO:0003924 | GTPase activity | 138 | 19 | 13.75 | 24 | 0.00233 | 0.00233 | 0.00233 |
| 20 | GO:0030246 | carbohydrate binding | 51 | 13 | 5.08 | 39 | 0.00624 | 0.00262 | 0.00031 |
| 21 | GO:0008061 | chitin binding | 77 | 12 | 7.67 | 26 | 0.00273 | 0.00273 | 0.00273 |
| 22 | GO:0036459 | thiol-dependent ubiquitinyl hydrolase ac… | 43 | 6 | 4.28 | 27 | 0.00293 | 0.00370 | 0.00293 |
| 23 | GO:0003777 | microtubule motor activity | 98 | 10 | 9.76 | 28 | 0.00410 | 0.00410 | 0.00410 |
| 25 | GO:0008047 | enzyme activator activity | 51 | 6 | 5.08 | 60 | 0.01276 | 0.00434 | 0.00239 |
| 26 | GO:0016840 | carbon-nitrogen lyase activity | 8 | 1 | 0.80 | 31 | 0.00461 | 0.00461 | 0.00461 |
| 27 | GO:0004550 | nucleoside diphosphate kinase activity | 5 | 1 | 0.50 | 33 | 0.00488 | 0.00488 | 0.00488 |
| 28 | GO:0005337 | nucleoside transmembrane transporter act… | 7 | 2 | 0.70 | 34 | 0.00491 | 0.00491 | 0.00491 |
| 29 | GO:0047631 | ADP-ribose diphosphatase activity | 6 | 2 | 0.60 | 35 | 0.00568 | 0.00568 | 0.00568 |
| 30 | GO:0004983 | neuropeptide Y receptor activity | 9 | 1 | 0.90 | 38 | 0.00620 | 0.00620 | 0.00620 |
| 31 | GO:0051015 | actin filament binding | 30 | 5 | 2.99 | 40 | 0.00624 | 0.00624 | 0.00624 |
| 32 | GO:0070577 | lysine-acetylated histone binding | 5 | 1 | 0.50 | 41 | 0.00687 | 0.00687 | 0.00687 |
| 33 | GO:0016504 | peptidase activator activity | 5 | 1 | 0.50 | 42 | 0.00687 | 0.00687 | 0.00687 |
| 34 | GO:0070628 | proteasome binding | 5 | 1 | 0.50 | 43 | 0.00687 | 0.00687 | 0.00687 |
| 35 | GO:0070006 | metalloaminopeptidase activity | 5 | 2 | 0.50 | 46 | 0.00750 | 0.00750 | 0.00750 |
| 36 | GO:0016787 | hydrolase activity | 1382 | 215 | 137.65 | 14 | 0.00117 | 0.00766 | 2.7e-17 |
| 37 | GO:0016772 | transferase activity, transferring phosp… | 459 | 53 | 45.72 | 96 | 0.02951 | 0.00800 | 6.6e-06 |
| 38 | GO:0005328 | neurotransmitter:sodium symporter activi… | 18 | 5 | 1.79 | 48 | 0.00861 | 0.00861 | 0.00861 |
| 39 | GO:0004198 | calcium-dependent cysteine-type endopept… | 28 | 4 | 2.79 | 51 | 0.00901 | 0.00901 | 0.00901 |
| 41 | GO:0005267 | potassium channel activity | 42 | 9 | 4.18 | 55 | 0.01126 | 0.01126 | 7.9e-05 |
| 43 | GO:0015293 | symporter activity | 32 | 6 | 3.19 | 81 | 0.02507 | 0.01227 | 0.00184 |
| 44 | GO:0003700 | DNA-binding transcription factor activit… | 158 | 17 | 15.74 | 95 | 0.02897 | 0.01307 | 0.02897 |
| 45 | GO:0030248 | cellulose binding | 7 | 1 | 0.70 | 62 | 0.01386 | 0.01386 | 0.01386 |
| 46 | GO:0004867 | serine-type endopeptidase inhibitor acti… | 9 | 2 | 0.90 | 63 | 0.01419 | 0.01419 | 0.01419 |
| 47 | GO:0008092 | cytoskeletal protein binding | 144 | 19 | 14.34 | 71 | 0.01797 | 0.01517 | 0.00059 |
| 48 | GO:0022848 | acetylcholine-gated cation-selective cha… | 16 | 2 | 1.59 | 65 | 0.01636 | 0.01636 | 0.01636 |
| 49 | GO:0004096 | catalase activity | 15 | 2 | 1.49 | 70 | 0.01677 | 0.01677 | 0.01677 |
| 50 | GO:0020037 | heme binding | 95 | 17 | 9.46 | 72 | 0.01859 | 0.01859 | 0.01859 |
| 52 | GO:0015081 | sodium ion transmembrane transporter act… | 101 | 17 | 10.06 | 304 | 0.27297 | 0.02355 | 0.27297 |
| 53 | GO:0015491 | cation:cation antiporter activity | 10 | 4 | 1.00 | 53 | 0.00920 | 0.02363 | 0.00920 |
kable(
MFvar.all[MFvar.all$Significant < MFvar.all$Expected,],
caption = "Most under-represented terms of the Molecular Function ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 17 | GO:0043565 | sequence-specific DNA binding | 146 | 11 | 14.54 | 52 | 0.00907 | 0.00213 | 0.00907 |
| 24 | GO:0003735 | structural constituent of ribosome | 113 | 2 | 11.26 | 30 | 0.00432 | 0.00432 | 0.00432 |
| 40 | GO:0008017 | microtubule binding | 62 | 6 | 6.18 | 54 | 0.01005 | 0.01005 | 0.01005 |
| 42 | GO:0004298 | threonine-type endopeptidase activity | 15 | 0 | 1.49 | 57 | 0.01164 | 0.01164 | 0.01164 |
| 51 | GO:0003712 | transcription coregulator activity | 42 | 3 | 4.18 | 44 | 0.00730 | 0.02015 | 0.00730 |
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
lea=significant.lea,
elim=significant.elim)))
kable(data.frame(Term=Term(GOTERM[sigTerms]),
Definition=Definition(GOTERM[sigTerms]),
PValue=score(MFvar.weight01)[sigTerms]),
caption = paste('Molecular functions terms significantly associated with', VAR,
'according to all 3 algorithms', sep=' '))
| Term | Definition | PValue | |
|---|---|---|---|
| GO:0005515 | protein binding | Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules). | 0.0000000 |
| GO:0005509 | calcium ion binding | Interacting selectively and non-covalently with calcium ions (Ca2+). | 0.0000000 |
| GO:0003774 | motor activity | Catalysis of the generation of force resulting either in movement along a microfilament or microtubule, or in torque resulting in membrane scission, coupled to the hydrolysis of a nucleoside triphosphate. | 0.0000004 |
| GO:0004930 | G protein-coupled receptor activity | Combining with an extracellular signal and transmitting the signal across the membrane by activating an associated G-protein; promotes the exchange of GDP for GTP on the alpha subunit of a heterotrimeric G-protein complex. | 0.0000206 |
| GO:0005525 | GTP binding | Interacting selectively and non-covalently with GTP, guanosine triphosphate. | 0.0000283 |
| GO:0004672 | protein kinase activity | Catalysis of the phosphorylation of an amino acid residue in a protein, usually according to the reaction: a protein + ATP = a phosphoprotein + ADP. | 0.0000399 |
| GO:0005230 | extracellular ligand-gated ion channel activity | Enables the transmembrane transfer of an ion by a channel that opens when a specific extracellular ligand has been bound by the channel complex or one of its constituent parts. | 0.0000446 |
| GO:0022857 | transmembrane transporter activity | Enables the transfer of a substance, usually a specific substance or a group of related substances, from one side of a membrane to the other. | 0.0001541 |
| GO:0005524 | ATP binding | Interacting selectively and non-covalently with ATP, adenosine 5’-triphosphate, a universally important coenzyme and enzyme regulator. | 0.0001971 |
| GO:0004888 | transmembrane signaling receptor activity | Combining with an extracellular or intracellular signal and transmitting the signal from one side of the membrane to the other to initiate a change in cell activity or state as part of signal transduction. | 0.0002077 |
| GO:0004181 | metallocarboxypeptidase activity | Catalysis of the hydrolysis of C-terminal amino acid residues from a polypeptide chain by a mechanism in which water acts as a nucleophile, one or two metal ions hold the water molecule in place, and charged amino acid side chains are ligands for the metal ions. | 0.0003201 |
| GO:0005507 | copper ion binding | Interacting selectively and non-covalently with copper (Cu) ions. | 0.0006774 |
| GO:0004252 | serine-type endopeptidase activity | Catalysis of the hydrolysis of internal, alpha-peptide bonds in a polypeptide chain by a catalytic mechanism that involves a catalytic triad consisting of a serine nucleophile that is activated by a proton relay involving an acidic residue (e.g. aspartate or glutamate) and a basic residue (usually histidine). | 0.0014114 |
| GO:0005249 | voltage-gated potassium channel activity | Enables the transmembrane transfer of a potassium ion by a voltage-gated channel. A voltage-gated channel is a channel whose open state is dependent on the voltage across the membrane in which it is embedded. | 0.0014195 |
| GO:0042626 | ATPase activity, coupled to transmembrane movement of substances | Catalysis of the reaction: ATP + H2O = ADP + phosphate, to directly drive the active transport of a substance across a membrane. | 0.0018293 |
| GO:0061630 | ubiquitin protein ligase activity | Catalysis of the transfer of ubiquitin to a substrate protein via the reaction X-ubiquitin + S -> X + S-ubiquitin, where X is either an E2 or E3 enzyme, the X-ubiquitin linkage is a thioester bond, and the S-ubiquitin linkage is an amide bond: an isopeptide bond between the C-terminal glycine of ubiquitin and the epsilon-amino group of lysine residues in the substrate or, in the linear extension of ubiquitin chains, a peptide bond the between the C-terminal glycine and N-terminal methionine of ubiquitin residues. | 0.0021109 |
| GO:0043565 | sequence-specific DNA binding | Interacting selectively and non-covalently with DNA of a specific nucleotide composition, e.g. GC-rich DNA binding, or with a specific sequence motif or type of DNA e.g. promotor binding or rDNA binding. | 0.0021341 |
| GO:0005200 | structural constituent of cytoskeleton | The action of a molecule that contributes to the structural integrity of a cytoskeletal structure. | 0.0021609 |
| GO:0003924 | GTPase activity | Catalysis of the reaction: GTP + H2O = GDP + phosphate. | 0.0023264 |
| GO:0030246 | carbohydrate binding | Interacting selectively and non-covalently with any carbohydrate, which includes monosaccharides, oligosaccharides and polysaccharides as well as substances derived from monosaccharides by reduction of the carbonyl group (alditols), by oxidation of one or more hydroxy groups to afford the corresponding aldehydes, ketones, or carboxylic acids, or by replacement of one or more hydroxy group(s) by a hydrogen atom. Cyclitols are generally not regarded as carbohydrates. | 0.0026230 |
| GO:0008061 | chitin binding | Interacting selectively and non-covalently with chitin, a linear polysaccharide consisting of beta-(1->4)-linked N-acetyl-D-glucosamine residues. | 0.0027273 |
| GO:0036459 | thiol-dependent ubiquitinyl hydrolase activity | Catalysis of the thiol-dependent hydrolysis of an ester, thioester, amide, peptide or isopeptide bond formed by the C-terminal glycine of ubiquitin. | 0.0036965 |
| GO:0003777 | microtubule motor activity | Catalysis of movement along a microtubule, coupled to the hydrolysis of a nucleoside triphosphate (usually ATP). | 0.0041032 |
| GO:0003735 | structural constituent of ribosome | The action of a molecule that contributes to the structural integrity of the ribosome. | 0.0043247 |
| GO:0016840 | carbon-nitrogen lyase activity | Catalysis of the release of ammonia or one of its derivatives, with the formation of a double bond or ring. Enzymes with this activity may catalyze the actual elimination of the ammonia, amine or amide, e.g. CH-CH(-NH-R) = C=CH- + NH2-R. Others, however, catalyze elimination of another component, e.g. water, which is followed by spontaneous reactions that lead to breakage of the C-N bond, e.g. L-serine ammonia-lyase (EC:4.3.1.17), so that the overall reaction is C(-OH)-CH(-NH2) = CH2-CO- + NH3, i.e. an elimination with rearrangement. The sub-subclasses of EC:4.3 are the ammonia-lyases (EC:4.3.1), lyases acting on amides, amidines, etc. (EC:4.3.2), the amine-lyases (EC:4.3.3), and other carbon-nitrogen lyases (EC:4.3.99). | 0.0046139 |
| GO:0004550 | nucleoside diphosphate kinase activity | Catalysis of the reaction: ATP + nucleoside diphosphate = ADP + nucleoside triphosphate. | 0.0048829 |
| GO:0005337 | nucleoside transmembrane transporter activity | Enables the transfer of a nucleoside, a nucleobase linked to either beta-D-ribofuranose (ribonucleoside) or 2-deoxy-beta-D-ribofuranose, (a deoxyribonucleotide) from one side of a membrane to the other. | 0.0049056 |
| GO:0047631 | ADP-ribose diphosphatase activity | Catalysis of the reaction: ADP-ribose + H2O = AMP + D-ribose 5-phosphate. | 0.0056840 |
| GO:0004983 | neuropeptide Y receptor activity | Combining with neuropeptide Y to initiate a change in cell activity. | 0.0061951 |
| GO:0051015 | actin filament binding | Interacting selectively and non-covalently with an actin filament, also known as F-actin, a helical filamentous polymer of globular G-actin subunits. | 0.0062409 |
| GO:0070577 | lysine-acetylated histone binding | Interacting selectively and non-covalently with a histone in which a lysine residue has been modified by acetylation. | 0.0068670 |
| GO:0016504 | peptidase activator activity | Binds to and increases the activity of a peptidase, any enzyme that catalyzes the hydrolysis peptide bonds. | 0.0068670 |
| GO:0070628 | proteasome binding | Interacting selectively and non-covalently with a proteasome, a large multisubunit protein complex that catalyzes protein degradation. | 0.0068670 |
| GO:0070006 | metalloaminopeptidase activity | Catalysis of the hydrolysis of N-terminal amino acid residues from a polypeptide chain by a mechanism in which water acts as a nucleophile, one or two metal ions hold the water molecule in place, and charged amino acid side chains are ligands for the metal ions. | 0.0075027 |
| GO:0016787 | hydrolase activity | Catalysis of the hydrolysis of various bonds, e.g. C-O, C-N, C-C, phosphoric anhydride bonds, etc. Hydrolase is the systematic name for any enzyme of EC class 3. | 0.0076576 |
| GO:0005328 | neurotransmitter:sodium symporter activity | Enables the transfer of a solute or solutes from one side of a membrane to the other according to the reaction: neurotransmitter(out) + Na+(out) = neurotransmitter(in) + Na+(in). | 0.0086099 |
| GO:0004198 | calcium-dependent cysteine-type endopeptidase activity | Catalysis of the hydrolysis of nonterminal peptide bonds in a polypeptide chain by a mechanism using a cysteine residue at the enzyme active center, and requiring the presence of calcium. | 0.0090130 |
showSigOfNodes(GOdataVar.MF, score(MFvar.weight01),
firstSigNodes = sum(significant.elim),
wantedNodes = sigTerms)
## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 182
## Number of Edges = 250
##
## $complete.dag
## [1] "A graph with 182 nodes."
I plot variance portion, but for the term found most significant when using p-values, for comparison.
showGroupDensity(GOdataVar.MF, names(score(MF.weight01))[which.min(score(MF.weight01))])
orderedTerms <- names(sort(score(CCvar.weight01)))
significant.weight01 <- score(CCvar.weight01)[orderedTerms] <= 0.01
significant.lea <- score(CCvar.lea)[orderedTerms] <= 0.01
significant.elim <- score(CCvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
if (length(sigTerms) == 0) {
sigTerms <- orderedTerms[significant.lea & significant.elim]
}
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.variance.sigTerms <- sigTerms
CCvar.all <- GenTable(GOdataVar.CC, elim=CCvar.elim, weight01=CCvar.weight01, lea=CCvar.lea,
orderBy="weight01", ranksOf="elim", topNodes=max(sum(significant.elim), 10))
kable(
CCvar.all[CCvar.all$Significant > CCvar.all$Expected,],
caption = "Most over-represented terms of the Cellular Component ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | GO:0016021 | integral component of membrane | 885 | 133 | 80.29 | 1 | 5.7e-23 | 2.4e-21 | 5.9e-28 |
| 2 | GO:0016020 | membrane | 1565 | 211 | 141.99 | 2 | 1.6e-10 | 3.7e-10 | < 1e-30 |
| 3 | GO:0016459 | myosin complex | 33 | 11 | 2.99 | 4 | 1.2e-07 | 1.2e-07 | 1.2e-07 |
| 4 | GO:0005576 | extracellular region | 140 | 25 | 12.70 | 3 | 7.7e-08 | 2.7e-06 | 7.7e-08 |
| 5 | GO:0008076 | voltage-gated potassium channel complex | 13 | 2 | 1.18 | 5 | 6.6e-05 | 6.6e-05 | 6.6e-05 |
| 6 | GO:0005887 | integral component of plasma membrane | 118 | 19 | 10.71 | 6 | 1.0e-04 | 0.00023 | 6.8e-07 |
| 8 | GO:0005886 | plasma membrane | 170 | 26 | 15.42 | 8 | 0.00022 | 0.00059 | 5.7e-11 |
| 10 | GO:0090575 | RNA polymerase II transcription factor c… | 23 | 6 | 2.09 | 7 | 0.00020 | 0.00258 | 0.00020 |
| 11 | GO:0098797 | plasma membrane protein complex | 34 | 5 | 3.08 | 30 | 0.01397 | 0.00313 | 5.9e-06 |
| 12 | GO:0005929 | cilium | 27 | 7 | 2.45 | 11 | 0.00072 | 0.00380 | 1.5e-05 |
| 14 | GO:0034704 | calcium channel complex | 5 | 1 | 0.45 | 21 | 0.00622 | 0.00622 | 0.00622 |
| 15 | GO:0045211 | postsynaptic membrane | 16 | 2 | 1.45 | 28 | 0.00906 | 0.00906 | 0.00906 |
| 17 | GO:0005743 | mitochondrial inner membrane | 33 | 5 | 2.99 | 16 | 0.00153 | 0.01470 | 0.00153 |
| 19 | GO:0031012 | extracellular matrix | 9 | 2 | 0.82 | 37 | 0.02018 | 0.02018 | 0.02018 |
| 20 | GO:0008023 | transcription elongation factor complex | 10 | 2 | 0.91 | 38 | 0.02241 | 0.02241 | 0.02241 |
| 21 | GO:0005874 | microtubule | 29 | 7 | 2.63 | 42 | 0.02318 | 0.02318 | 0.02318 |
| 22 | GO:0016591 | RNA polymerase II, holoenzyme | 20 | 3 | 1.81 | 17 | 0.00265 | 0.02524 | 0.00265 |
| 23 | GO:0044441 | ciliary part | 21 | 5 | 1.91 | 39 | 0.02247 | 0.02644 | 0.00047 |
| 24 | GO:0005858 | axonemal dynein complex | 5 | 1 | 0.45 | 47 | 0.02762 | 0.02762 | 0.02762 |
| 26 | GO:0099512 | supramolecular fiber | 31 | 9 | 2.81 | 27 | 0.00857 | 0.03429 | 0.00857 |
| 28 | GO:0005669 | transcription factor TFIID complex | 7 | 1 | 0.64 | 52 | 0.03784 | 0.03784 | 0.03784 |
kable(
CCvar.all[CCvar.all$Significant < CCvar.all$Expected,],
caption = "Most under-represented terms of the Cellular Component ontology.")
| GO.ID | Term | Annotated | Significant | Expected | Rank in elim | elim | weight01 | lea | |
|---|---|---|---|---|---|---|---|---|---|
| 7 | GO:0005840 | ribosome | 115 | 3 | 10.43 | 12 | 0.00083 | 0.00038 | 0.00083 |
| 9 | GO:0005681 | spliceosomal complex | 12 | 1 | 1.09 | 15 | 0.00145 | 0.00145 | 0.00145 |
| 13 | GO:0005634 | nucleus | 531 | 48 | 48.18 | 13 | 0.00102 | 0.00382 | 2.0e-07 |
| 16 | GO:0098800 | inner mitochondrial membrane protein com… | 21 | 1 | 1.91 | 56 | 0.03838 | 0.00996 | 0.03838 |
| 18 | GO:0016592 | mediator complex | 24 | 1 | 2.18 | 35 | 0.01870 | 0.01870 | 0.01870 |
| 25 | GO:0000124 | SAGA complex | 5 | 0 | 0.45 | 49 | 0.03003 | 0.03003 | 0.03003 |
| 27 | GO:0031461 | cullin-RING ubiquitin ligase complex | 15 | 0 | 1.36 | 93 | 0.10223 | 0.03779 | 0.10223 |
| 29 | GO:0005802 | trans-Golgi network | 6 | 0 | 0.54 | 57 | 0.03860 | 0.03860 | 0.03860 |
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
lea=significant.lea,
elim=significant.elim)))
kable(data.frame(Term=Term(GOTERM[sigTerms]),
Definition=Definition(GOTERM[sigTerms]),
PValue=score(CCvar.weight01)[sigTerms]),
caption = paste('Cellular component terms significantly associated with', VAR,
'according to all 3 algorithms', sep=' '))
| Term | Definition | PValue | |
|---|---|---|---|
| GO:0016021 | integral component of membrane | The component of a membrane consisting of the gene products and protein complexes having at least some part of their peptide sequence embedded in the hydrophobic region of the membrane. | 0.0000000 |
| GO:0016020 | membrane | A lipid bilayer along with all the proteins and protein complexes embedded in it an attached to it. | 0.0000000 |
| GO:0016459 | myosin complex | A protein complex, formed of one or more myosin heavy chains plus associated light chains and other proteins, that functions as a molecular motor; uses the energy of ATP hydrolysis to move actin filaments or to move vesicles or other cargo on fixed actin filaments; has magnesium-ATPase activity and binds actin. Myosin classes are distinguished based on sequence features of the motor, or head, domain, but also have distinct tail regions that are believed to bind specific cargoes. | 0.0000001 |
| GO:0005576 | extracellular region | The space external to the outermost structure of a cell. For cells without external protective or external encapsulating structures this refers to space outside of the plasma membrane. This term covers the host cell environment outside an intracellular parasite. | 0.0000027 |
| GO:0008076 | voltage-gated potassium channel complex | A protein complex that forms a transmembrane channel through which potassium ions may cross a cell membrane in response to changes in membrane potential. | 0.0000665 |
| GO:0005887 | integral component of plasma membrane | The component of the plasma membrane consisting of the gene products and protein complexes having at least some part of their peptide sequence embedded in the hydrophobic region of the membrane. | 0.0002347 |
| GO:0005840 | ribosome | An intracellular organelle, about 200 A in diameter, consisting of RNA and protein. It is the site of protein biosynthesis resulting from translation of messenger RNA (mRNA). It consists of two subunits, one large and one small, each containing only protein and RNA. Both the ribosome and its subunits are characterized by their sedimentation coefficients, expressed in Svedberg units (symbol: S). Hence, the prokaryotic ribosome (70S) comprises a large (50S) subunit and a small (30S) subunit, while the eukaryotic ribosome (80S) comprises a large (60S) subunit and a small (40S) subunit. Two sites on the ribosomal large subunit are involved in translation, namely the aminoacyl site (A site) and peptidyl site (P site). Ribosomes from prokaryotes, eukaryotes, mitochondria, and chloroplasts have characteristically distinct ribosomal proteins. | 0.0003808 |
| GO:0005886 | plasma membrane | The membrane surrounding a cell that separates the cell from its external environment. It consists of a phospholipid bilayer and associated proteins. | 0.0005897 |
| GO:0005681 | spliceosomal complex | Any of a series of ribonucleoprotein complexes that contain snRNA(s) and small nuclear ribonucleoproteins (snRNPs), and are formed sequentially during the spliceosomal splicing of one or more substrate RNAs, and which also contain the RNA substrate(s) from the initial target RNAs of splicing, the splicing intermediate RNA(s), to the final RNA products. During cis-splicing, the initial target RNA is a single, contiguous RNA transcript, whether mRNA, snoRNA, etc., and the released products are a spliced RNA and an excised intron, generally as a lariat structure. During trans-splicing, there are two initial substrate RNAs, the spliced leader RNA and a pre-mRNA. | 0.0014522 |
| GO:0090575 | RNA polymerase II transcription factor complex | A transcription factor complex that acts at a regulatory region of a gene transcribed by RNA polymerase II. | 0.0025830 |
| GO:0005929 | cilium | A specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface and of some cytoplasmic parts. Each cilium is largely bounded by an extrusion of the cytoplasmic (plasma) membrane, and contains a regular longitudinal array of microtubules, anchored to a basal body. | 0.0037999 |
| GO:0005634 | nucleus | A membrane-bounded organelle of eukaryotic cells in which chromosomes are housed and replicated. In most cells, the nucleus contains all of the cell’s chromosomes except the organellar chromosomes, and is the site of RNA synthesis and processing. In some species, or in specialized cell types, RNA metabolism or DNA replication may be absent. | 0.0038217 |
| GO:0034704 | calcium channel complex | An ion channel complex through which calcium ions pass. | 0.0062205 |
| GO:0045211 | postsynaptic membrane | A specialized area of membrane facing the presynaptic membrane on the tip of the nerve ending and separated from it by a minute cleft (the synaptic cleft). Neurotransmitters cross the synaptic cleft and transmit the signal to the postsynaptic membrane. | 0.0090582 |
showSigOfNodes(GOdataVar.CC, score(CCvar.weight01),
firstSigNodes = sum(significant.weight01),
wantedNodes = sigTerms)
## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 68
## Number of Edges = 123
##
## $complete.dag
## [1] "A graph with 68 nodes."
For comparison, I plot the distribution of variance portion for the CC term found most significant when using p-values.
showGroupDensity(GOdataVar.CC, names(score(CC.weight01))[which.min(score(CC.weight01))])
allTerms <- usedGO(GOdata.BP)
vennDiagram(vennCounts(cbind(PValue = allTerms %in% BP.pvalue.sigTerms,
Variance = allTerms %in% BP.variance.sigTerms)))
ggplot(data = data.frame(PValue = rank(score(BP.weight01))[allTerms],
Variance = rank(score(BPvar.weight01))[allTerms]),
mapping = aes(x = PValue, y = Variance)) +
geom_point() + geom_smooth() + xlab('Using gene p-values') +
ylab('Using portion of explained variance') +
ggtitle('Ordering of BP terms by significance')
allTerms <- usedGO(GOdata.MF)
vennDiagram(vennCounts(cbind(PValue = allTerms %in% MF.pvalue.sigTerms,
Variance = allTerms %in% MF.variance.sigTerms)))
ggplot(data = data.frame(PValue = rank(score(MF.weight01))[allTerms],
Variance = rank(score(MFvar.weight01))[allTerms]),
mapping = aes(x = PValue, y = Variance)) +
geom_point() + geom_smooth() + xlab('Using gene p-values') +
ylab('Using portion of explained variance') +
ggtitle('Ordering of MF terms by significance')
allTerms <- usedGO(GOdata.CC)
vennDiagram(vennCounts(cbind(PValue = allTerms %in% CC.pvalue.sigTerms,
Variance = allTerms %in% CC.variance.sigTerms)))
ggplot(data = data.frame(PValue = rank(score(CC.weight01))[allTerms],
Variance = rank(score(CCvar.weight01))[allTerms]),
mapping = aes(x = PValue, y = Variance)) +
geom_point() + geom_smooth() + xlab('Using gene p-values') +
ylab('Using portion of explained variance') +
ggtitle('Ordering of CC terms by significance')
I save the main variables to be able to load them in a new R session later.
save(allgenes2GO,
GOdata.BP, BP.elim, BP.weight01, BP.lea, BP.pvalue.sigTerms,
GOdata.MF, MF.elim, MF.weight01, MF.lea, MF.pvalue.sigTerms,
GOdata.CC, CC.elim, CC.weight01, CC.lea, CC.pvalue.sigTerms,
GOdataVar.BP, BPvar.elim, BPvar.weight01, BPvar.lea, BP.variance.sigTerms,
GOdataVar.MF, MFvar.elim, MFvar.weight01, MFvar.lea, MF.variance.sigTerms,
GOdataVar.CC, CCvar.elim, CCvar.weight01, CCvar.lea, CC.variance.sigTerms,
file = paste('Enrichment', TAG, VAR, 'RData', sep='.'))
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=ca_ES.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=ca_ES.UTF-8
## [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=ca_ES.UTF-8
## [7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] grid stats4 parallel stats graphics grDevices utils
## [8] datasets methods base
##
## other attached packages:
## [1] Rgraphviz_2.30.0 ggplot2_3.2.1 limma_3.42.0
## [4] knitr_1.27 topGO_2.38.1 SparseM_1.78
## [7] GO.db_3.10.0 AnnotationDbi_1.48.0 IRanges_2.20.2
## [10] S4Vectors_0.24.3 Biobase_2.46.0 graph_1.64.0
## [13] BiocGenerics_0.32.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_0.2.5 xfun_0.12 purrr_0.3.3 splines_3.6.2
## [5] lattice_0.20-38 colorspace_1.4-1 vctrs_0.2.1 htmltools_0.4.0
## [9] yaml_2.2.0 mgcv_1.8-31 blob_1.2.1 rlang_0.4.2
## [13] pillar_1.4.3 glue_1.3.1 withr_2.1.2 DBI_1.1.0
## [17] bit64_0.9-7 matrixStats_0.55.0 lifecycle_0.1.0 stringr_1.4.0
## [21] munsell_0.5.0 gtable_0.3.0 memoise_1.1.0 evaluate_0.14
## [25] labeling_0.3 highr_0.8 Rcpp_1.0.3 backports_1.1.5
## [29] scales_1.1.0 farver_2.0.3 bit_1.1-15.1 digest_0.6.23
## [33] stringi_1.4.5 dplyr_0.8.3 tools_3.6.2 magrittr_1.5
## [37] lazyeval_0.2.2 tibble_2.1.3 RSQLite_2.2.0 crayon_1.3.4
## [41] pkgconfig_2.0.3 zeallot_0.1.0 Matrix_1.2-18 assertthat_0.2.1
## [45] rmarkdown_2.1 R6_2.4.1 nlme_3.1-143 compiler_3.6.2